Deep Learning Visuals

Over 200 figures and diagrams of the most popular deep learning architectures and layers, free to use in your blog posts, slides, presentations, or papers.

Initialization Schemes and Gradient Clipping

These images were originally published in the book “Deep Learning with PyTorch Step-by-Step: A Beginner’s Guide”.

They are also available at the book’s official repository: https://github.com/dvgodoy/PyTorchStepByStep.

Index

Vanishing Gradients
Initialization Schemes
- Comparing against BatchNorm
Gradient Clipping

Papers

Xavier/Glorot Initialization: “Understanding the difficulty of training deep feedforward neural networks” by Glorot, X. and Bengio, Y. (2010)
Kaiming/He Initialization: “Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification” by He, K. et al. (2015)

Vanishing Gradients

Source: Chapter Extra

Initialization Schemes

Source: Chapter Extra
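
As a rough companion to the figures in this section, the sketch below computes the scale used by the two schemes listed under Papers: the uniform bound sqrt(6 / (fan_in + fan_out)) for Xavier/Glorot and the standard deviation sqrt(2 / fan_in) for Kaiming/He with ReLU. The function names and the pure-Python weight lists are assumptions made for illustration; they are not taken from the book.

```python
import math
import random


def xavier_uniform_bound(fan_in, fan_out):
    # Glorot & Bengio (2010): weights drawn from U(-bound, bound)
    return math.sqrt(6.0 / (fan_in + fan_out))


def kaiming_normal_std(fan_in):
    # He et al. (2015), ReLU case: weights drawn from N(0, std^2)
    return math.sqrt(2.0 / fan_in)


def init_weights(fan_in, fan_out, scheme="xavier", seed=0):
    # Hypothetical helper: returns a fan_out x fan_in matrix as nested lists
    rng = random.Random(seed)
    if scheme == "xavier":
        bound = xavier_uniform_bound(fan_in, fan_out)
        return [[rng.uniform(-bound, bound) for _ in range(fan_in)]
                for _ in range(fan_out)]
    if scheme == "kaiming":
        std = kaiming_normal_std(fan_in)
        return [[rng.gauss(0.0, std) for _ in range(fan_in)]
                for _ in range(fan_out)]
    raise ValueError(f"unknown scheme: {scheme}")


weights = init_weights(fan_in=100, fan_out=50, scheme="xavier")
```

In PyTorch, the corresponding built-in initializers are `torch.nn.init.xavier_uniform_` and `torch.nn.init.kaiming_normal_`.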

Comparing against BatchNorm

Source: Chapter Extra

Gradient Clipping

Value Clipping

Source: Chapter Extra
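
A minimal sketch of value clipping, assuming the gradients are given as a flat list of floats: each element is clamped independently to the range [-clip_value, clip_value]. The function name `clip_by_value` is hypothetical.

```python
def clip_by_value(grads, clip_value):
    # Clamp every gradient element to [-clip_value, clip_value]
    return [max(-clip_value, min(clip_value, g)) for g in grads]


clipped = clip_by_value([0.5, -3.0, 2.0], clip_value=1.0)
```

Because each element is clamped on its own, the direction of the gradient vector can change. PyTorch provides this operation in place as `torch.nn.utils.clip_grad_value_`.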

Norm Clipping

Source: Chapter Extra
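
A minimal sketch of norm clipping under the same assumption of a flat list of gradients: if the L2 norm exceeds `max_norm`, all elements are scaled by the same factor, so the direction is preserved. The function name `clip_by_norm` and the small `eps` term are illustrative choices.

```python
import math


def clip_by_norm(grads, max_norm, eps=1e-6):
    # L2 norm of the whole gradient vector
    total_norm = math.sqrt(sum(g * g for g in grads))
    # Scale down only when the norm exceeds max_norm
    scale = min(1.0, max_norm / (total_norm + eps))
    return [g * scale for g in grads]


clipped = clip_by_norm([3.0, 4.0], max_norm=1.0)
```

PyTorch provides this operation in place as `torch.nn.utils.clip_grad_norm_`.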

Using Hooks

Source: Chapter Extra
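
One way to clip gradients with hooks in PyTorch is to register a tensor hook on each parameter; the value the hook returns replaces the gradient during the backward pass. The sketch below assumes a toy `nn.Linear` model, made-up inputs, and an arbitrary `clip_value`; it is not necessarily the setup shown in the figure.

```python
import torch
import torch.nn as nn

torch.manual_seed(42)
model = nn.Linear(2, 1)   # toy model, assumed for illustration
clip_value = 0.01         # arbitrary threshold

# Register one hook per parameter; each hook clamps the incoming gradient
handles = [
    p.register_hook(lambda grad: torch.clamp(grad, -clip_value, clip_value))
    for p in model.parameters()
    if p.requires_grad
]

# Large inputs produce large raw gradients
x = torch.tensor([[100.0, -200.0]])
y = torch.tensor([[0.0]])
loss = nn.functional.mse_loss(model(x), y)
loss.backward()

# The stored gradients are already clipped
max_abs_grad = max(p.grad.abs().max().item() for p in model.parameters())

# Remove the hooks once they are no longer needed
for handle in handles:
    handle.remove()
```

With hooks, clipping happens during `backward()` itself, so no separate clipping call is needed before the optimizer step.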

This work is licensed under a Creative Commons Attribution 4.0 International License.